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Context. Two-point correlation functions are used throughout cosmology as a measure for the statistics of random fields. When used 
in Bayesian parameter estimation, their likelihood function is usually replaced by a Gaussian approximation. However, this has been 
shown to be insufficient. 

Aims. For the case of Gaussian random fields, we search for an exact probability distribution of correlation functions, which could 
improve the accuracy of future data analyses. 

Methods. We use a fully analytic approach, first expanding the random field in its Fourier modes, and then calculating the characteris- 
tic function. Finally, we derive the probability distribution function using integration by residues. We use a numerical implementation 
of the full analytic formula to discuss the behaviour of this function. 

Results. We derive the univariate and bivariate probability distribution function of the correlation functions of a Gaussian random 
field, and outline how higher joint distributions could be calculated. We give the results in the fom of mode expansions, but in one 
special case we also find a closed-form expression. We calculate the moments of the distribution and, in the univariate case, we discuss 
the Edgeworth expansion approximation. We also comment on the difficulties in a fast and exact numerical implementation of our 
results, and on possible future applications. 

Key words, cosmology - gravitational lensing - large-scale structure of the Universe - galaxies: statistics - Methods: statistical 



1. Introduction 

In several fields of science, there are observations that can be 
modelled as random processes, either time series or spatial ran- 
dom fields. In cosmology, mostly random fields are of interest, 
for example as the density perturbation field. Therefore, we use 
the language of random fields in this article. Still, all results are 
applicable to time series as well. 

An important statistical quantity of random fields is the two- 
point correlation function, labelled ^(x) in the following, with x 
being the separation between two points of the field. When the 
empirical correlation function has been measured, it can be used 
to estimate the parameters of a theoretical model for the ran- 
dom field. The standard method for this, at least in cosmology, 
is Bayesian inference, using Bayes' theorem: 



J d6'p(^\e')p(e')' 



(1) 



where are the model parameters, p(6\^) is the posterior prob- 
ability of some parameters given the data ^, p{^\6) is called the 
likelihood and p(ff) is the prior probability of the parameters. 
The prior can be chosen, more or less freely, by several schemes, 
and the integral in the denominator is usually not relevant for pa- 
rameter estimation studies, as only posterior ratios are necessary. 
But it is crucial to the Bayesian analysis to determine the correct 
likelihood function for the problem beforehand, a process also 
known as forward modelling. 

However, in many applications, such as those most relevant 
for cosmology, it is very difficult to obtain the correct like- 
lihood function either empirically or theoretically. Therefore, 



in those cases where the underlying random field is assumed 
to be Gaussian, it has been standard practice for some time 
to simply use a Gaussian approximation for the likelihood, as 
well. Examples include Fu et al. (2008) in a cosmic shear anal- 
ysis, Okumura et al. (2008) for luminous red galaxy counts, and 
iSeliak & Bertschinger. (.1993.) for the cosmic microwave back- 
ground correlation function. 

There is no a priori reason to assume that the Gaussian ap- 
proximation s hould be exact, or e ven very accurate. In fact, it 
was shown bv lHartlap et al.l (12009 ) that there is a significant de- 
viation between the real likelihood and a Gaussian for simulated 
cosmic shear data. In this case, the error bars on cosmological 
parameters improved by 10-40% when using a non-Gaussian 
likelihood. This result encouraged a mathematical study on the 
pro perties of correla tion functions of Gaussian random fields. 
Sch neider & Hartiapl (|2009) found that these correlation func- 
tions cannot take arbitrary values, but are subject to constraints. 
This can be seen from first studying the power spectrum of 
the field, which is Fourier conjugate to the correlation function. 
Power spectra are always non-negative, P(k) > for all wave 
vectors k. 

If the correlation function is measured over a separation vec- 
tor (or 'lag parameter') x and its integer multiples nx, n e N, 
the non-negativity of P(k) leads to constraints in the form of in- 
equalities r„i < r„ < r„u. Here, r„ - ^{nx)/^(0) is the 'correlation 
coefficient' normalised by the correlation at zero separation, and 
the upper and lower boundaries r„i and r„u are functions of all 
r, with / < n. The three lowest order constraints are < ^(0), 
-1 < ri < 1 and -1 + 2r\ < < 1. 
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Since a Gaussian or multivariate Gaussian is unbounded, the 
existence of the constraints already demonstrates that the real 
likelihood function cannot be exactly Gaussian, but at least must 
have its tails cut off. Still, the constraints do not immediately lead 
to a full descripti on of the likelihood function. Another study by 
ISato et alj (1201 lb has used the copula approach to construct a 
more realistic likelihood function for large-scale structure data. 
However, this approach is mostly useful for numerical applica- 
tion and does not yield direct insights into the analytical struc- 
ture of the likelihood functi on. Also, a pr eliminary investigation 
bylWilking &. Schneider (in preparationb has shown that a sim- 
ple approach using a Gaussian copula cannot correctly repro- 
duce the likelihood under constraints. Therefore, in this article 
we will focus on a simple type of random fields, but try to find a 
fully analytical expression for the likelihood function, or proba- 
bility distribution, of correlation functions, and will demonstrate 
that it is indeed manifestly non-Gaussian. 

For this derivation, we concentrate on Gaussian random 
fields, since their unique properties allow for a fully analyti- 
cal calculation. In addition, a Gaussian field is fully specified 
by its two-point statistics, i.e. ^(jc) or equivalently P{k), which 
makes the derivation of their probability distributions especially 
rewarding, since then P(k) determines the full set of joint prob- 
ability distributions p{^\), p{^\,^2), p(^i,^2,^i) and so on, with 

This article consists of five main sections, apart from this 
introduction. In Sect. |2] we derive the univariate probability dis- 
tribution function. We also calculate its moments and present ex- 
plicit results for a special power spectrum. Then, we repeat the 
derivation for bivariate distributions in Sect. |3] also discussing 
its moments. We go on to discuss possible numerical implemen- 
tations of these results in Sect.|4] Using numerical evaluation, we 
can discuss the properties of the distribution functions in more 
detail in Sect. |5] There, we comment on the general analytical 
properties of the uni- and bivariate functions. We also use the 
moments to construct an Edgeworth expansion of the univari- 
ate distribution, and we generalise our derivations and results to 
higher dimensions. We conclude the article in Sect. |6] 



2. Univariate distribution 

2.1. Derivation 

We describe a real Gaussian random field by its Fourier decom- 
position 



(2) 



The Fourier modes are independently distributed, each with a 
Gaussian probability distribution: 



P(gn) 



1 



no-i 



.-\g„\-/o-l 



(3) 



the whole of R^'^^" if the random field has no power on scales 
larger than L. This is also equivalent to the assumption of sta- 
tistical homogeneity on these scales. Together with isotropy, we 
know that the correlation function depends on the distance mod- 
ulus, or lag parameter, only: 



ax,y)^^(\x-y\). 



(5) 



We will now concentrate on a one-dimensional field to keep ex- 
pressions simple. However, in Sect. 15.41 we will see that all re- 
sults also hold for higher dimensions. We start with an estimator 
for the correlation function from the finite field, given by 



^(x) = {8(y)8*(x + 3')) = J" dyg(y)g*(x + y) , 





(6) 



which approaches the true value for L — > oo. 

Going to A:-space, the Fourier modes that fit inside the in- 
terval [0, L] are discrete. Each wave number kn has to fulfil the 
condition e'*^"^ = 1, so that 



_ 2^ 
n - —n 



(7) 



with n e N. For a real-valued random field, the Fourier compo- 
nents fulfil g-„ = g*. Using this property, the mode expansion of 
the field can be split up as 



(8) 



Without loss of generality, we assume that the field has zero 
mean, since we can always achieve this by a simple transforma- 
tion. Then, the zero mode go cancels out. We can then insert his 
expansion into the estimator (|6]l. For the spatial integrals, we can 
use the integral representation of the Kronecker delta symbol: 



/ 



\{'lnlL)x{n-m) 



L6n 



L if n — m . 
if n ^ m . 



(9) 



The correlation function is then given by 

^ oo oo 



;i=l m=l 



+^«^:,e-''-''""5„,„, + g*„glfi-'''"''6-„,,„) . (10) 



Executing the sum over m, only half of the terms survive, and 
the remaining exponentials give a cosine function: 



^(X) ^ l^lgnf COS(knX) . 



(11) 



The mode dispersions are determined by the power spectrum, 

1 



P{\k„\) . 



(4) 



The following derivations will be independent of the choice of 
a power spectrum, as long as it obeys the constraint of non- 
negativity. So the cr„ are arbitrary parameters. 

As is usually done in numerical simulations, we consider a 
finite field, x e [0, L]'^''"", with periodic boundary conditions. A 
sufficiently large finite field will be representative of a field on 



Now that we have a convenient expression for (the estimator of) 
the correlation function, we need to take one more intermedi- 
ate step before calculating its probability distribution. This is 
the characteristic function, which, in general, is defined as the 
Fourier transform of a probability distribution function. For the 
given random field, we can calculate the characteristic function 
by means of an ensemble average: 



iA(i) 



(e"fW)^. n ^'SnP(gn) 
Vn=I ^ / 



(12) 
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Since the field is Gaussian, the modes are independently dis- 
tributed, and the probability distribution factorises. Inserting 
(fTTT i. we get 



half-plane contribute to the sum of residues. We encode this be- 
haviour in the factor 



n=l ^ 



)e- 



21^1^,,^ cos(A:„.v) 



n, = mwiCn) - H(-^)H(-c„) , (19) 

oo 

; i/'„(i) . (13) where for the Heaviside step function we use the convention 



«=i 



In the individual factors iff„(s), we substitute z - IgnP to solve 
the integral: 



2n 







r . , r di^„i \g„\ I \g„\- \ ( 2 /, ^^ 

J d0„ J — — J— exp|^ — —^e.x^\2is\gn\ cos(^„x)j 
1 - 2\scr^^ cos(A;„x)\ 



dz exp -z 



1 



o-t 



1 - 2\sa-fj co^iknx) 



(14) 



With the product over all modes, we obtain the full characteristic 
function as 



Ms) = n r 



1 



oo 



2isC„ 



„=i - 2\scrlcos.(k„x) 
where in the last step we introduced the shorthand notation 

Cn - crfj cos(k„x) . 



(15) 



(16) 



The characteristic function is an important result in its own right, 
since we can use it to calculate the moments of the distribution, 
which we will do in Sect. 12.21 But for now, we go on to calculate 
the probability density distribution p{^) by an inverse Fourier 
transform: 

oo oo 



We will solve this integral using the theorem of residues, since 
the integrand is analytic except at its poles 



2C„ 2crlcos(k„x) ' 



(18) 



All of these lie on the imaginary axis. However, if cos(k„x) = 
for some n, then the corresponding factor in the characteristic 
function is unity, and there is no pole and no contribution to 
the integral from this term. On the other hand, some C„ may 
be equal, and so there may be poles of higher order (multiple 
poles). These special cases will be discussed in appendix lAl but 
for now we focus on the standard case of simple poles. 

We can choose a contour of integration made up of two parts, 
a straight section [R, R] combined with a semicircle of radius R, 
which we parametrise by i = Re'^, with either (p £ [0, n] for 
the upper or € [n, 2ji] for the lower half-plane. To get the full 
integral for p{^), we have to take the limit ofR — > oo. The numer- 
ator of the integrand is e""^, whereas the denominator, after ex- 
ecuting the product, is only a polynomial in s. So the numerator 
dominates the convergence behaviour, requiring %{-\s^) < 0. 
For ^ > 0, this corresponds to V>{s) < 0, and we can close the 
contour in the lower half-plane. If instead ^ < 0, the requirement 
is 3(s) > 0, and we can close the contour in the upper half- 
plane. So for a given ^, only the poles lying in the corresponding 



'I if x>0. 
H{x)^lQ.5 if x = 0. 

if x<0. 



(20) 



If all poles s„ are simple, we can calculate the residues by 

1 - 2i5C„ 



ReSj,, - lim 



is - s„) 



1 — 2iiC„ 



^ g-f/(2C„) 



(21) 



Inserting the winding numbers w„ - \ for the upper and w„ = - 1 
for the lower contour, the full integral is then 



/H °° 1 

2^^""^nTT^=2.t2-„Res. 
_ n=\ n 

— CO 



(22) 



This result holds for most relevant combinations of input power 
spectra and lag parameters. For other cases, we derive a gener- 
alised result of a very similar functional form, that also holds for 
multiple poles, in appendixlAl 

We were unable to further simplify the limit of the infinite 
sum in the probability distribution function (l22l l. However, as 
long as the power spectrum decreases at least like k^^ for large k, 
our numerical implementation of the sum formulae, as described 
in Sect.m showed that the probability distribution function con- 
verges as well. In practice, it is therefore possible to truncate the 
series at some maximum mode number without losing much 
precision. 

Also, it is obvious from Eq. (l22l l that for large ^, a single 
mode will always dominate the sum, so that asymptotically, the 
distribution is not Gaussian, p{^) oc e"^ , but instead exponential, 
p{^) oc e-f/<2c,„„\ 

We also note at this point that Eq. ( l22l i depends on the field 
size L, separation x and power spectrum P{\kn\) only through the 
ratios xjL and P(|A:„|)/L, as can be seen from the definition of 
C„ in Eq. ( fT6b . When we present numerical results in the further 
course of this article, we therefore give these quantities only. 
Furthermore, in the case of a Gaussian power spectrum. 



P(\kn\) 



1 



a-p 



V2^ 



(23) 



we can directly state La-p as the relevant quantity. 

For such a power spectrum with L<jp - 150 and a separation 
X -Q, Fig.[T]demonstrates the convergence behaviour of the dis- 
tribution function. In this case, the calculations with N - 6A and 
= 128 produce virtually indistinguishable results, so that the 
former mode number is sufficient for all practical purposes. In 
general, the steepness of the power spectrum determines which 
maximum wave number is necessary: Inserting the wave num- 
bers from Eq. O into Eq. (l2Jt . we see that N ~ Lcrp/2 is suffi- 
cient for Gaussian power spectra. 
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Me5 = 128;^ ncl+5ClY,Cl 
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(31) 
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Fig. 1. Univariate distributions for different mode numbers N, demon- 
strating convergence. All distributions have a Gaussian power spectrum 
with LcTp = 150 and lag x = 0. With maxima from left to right: 
solid: N = \, long- dashed: N = 2, dashed: N = A, dotted: N = 
long-dashed-dotted: N = 16, dashed-dotted: N = 32, double-dashed: 
N = 64, long-dashed-double-dotted: N = 128. 



2.2. Moments 

In this section, we calculate the moments of the distribution. 
Apart from possible use in future applications, this is also useful 
as a check for the distribution function derived above, since we 
can derive the moments in two independent ways and compare 
results. First, we can get the moments from the derivatives 
of the characteristic function (iKendall & Stuarti[T977l p. 63): 



Mi =i- 



dV(i) 



di* 



s=0 



Hi- 2isC„ 



The first derivative yields the mean of the distribution, or the 
expectation value of ^: 



1 = -1 



ds 



.?=() 



(25) 



For the variance and other higher-order quantities, we use the 
central moments, which are the moments of the distribution of 
f - ^. The centralised characteristic function is simply 



n=l ^ ' 



(26) 

and it yields the central moments as (iKendall & Stuart|[l977l pp. 

57, 63) 



Mck = i 



ds* 



=0 



The first six non-zero central moments are then 

oo 



n=I 



oo / 



(28) 
(29) 
(30) 



Mr, 



= 320 £ 53C,^ + 27C:l + ^^C^ ^ Cl 



+ 18 ^ (CnCmCkf 



k>m>ii 



(32) 



From these moments, we can also obtain some conventional sta- 
tistical quantities: the variance V(^) = Mc2, the standard devia- 
tion o" = -sJV(^), the skewness S(^) - Mci/cr^ and the kurtosis 

m) = M,4/(T^ - 3. 

Alternatively, we can also calculate the moments from the 
probability distribution function by the integrals 



(33) 



An important check for the sanity of the distribution function 
will be to re-obtain the normalisation as unity by this approach. 
We can compute it as the moment of order zero: 



OO OO ^ oo 

(1=1 m#n ^ ^ 



(34) 



We can evaluate this sum of products by considering another 
(24) Fourier transform from p{^) back to t//{s): 



OO 

= J d^e''^;7(^) 

— OO 

oo 

oo 

Za„ 1 a„ 



l/(2C„)]f 



c„>o ^'^« ^■^ ~ 2^ £^0 ~ w;, 

oo oo 

, 2C (is - ~ ^ 1 - 2iiC„ ■ 

n=l ^>--n ^i'' 2C„ / "=I 



(35) 



Comparing this expression for the characteristic function to the 
original from Eq. (flST l, we get 



oo ^ oo 

^^^j 1 - 2i5C„ ~ ^ 1 - 2\sCn 



Evaluation at i = yields 



^ fl„ = 1 



(36) 



(37) 



so that together with Eq. ( l34l i we have shown that N - I. 
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To calculate the higher moments, we make use of the integral 



oo 

/ 





-f/(2C) 



with k>0 



(38) 



and of sum formulae of the type 

N N 



n-l mi^n Cn "=1 

N N . N N 

— 1 11 — 1 lit — n 



N N 



C„ n=l 
N N 



/i-l mi^n C„ n=l m=n k=m 



Ck. 



(39) 



(40) 



(41) 



These follow from taking derivatives with respect to s in Eq. (|36] | 
and setting s - Q. We have checked the results for mean, vari- 
ance, skewness and kurtosis and have reproduced the results of 
the characteristic function approach, demonstrating the validity 
of the probability distribution function. 



2.3. Cumulative distribution function 

From the probability distribution function (l2Zt . we can also di- 
rectly calculate the cumulative distribution function, defined as 
F{^) - P(^' > For single poles only, it is given by 



oo 

°° 1 1 r 

' oo 

Jdr./dr 



)e 



-?/(2C„) 



-i/(-^) 



f/(2C„) 



oo 

=x(H,e-^^^^^"'.//(-^))n-V 

.1 — 1 II -f- 111 ^ 



(42) 



Again, we use the notation "K,, = H(^)H(C„) - H(-^)H(-C„) 
for the Heaviside factor, but this time note the extra term of 
+H{-^). Analogous expressions in the presence of higher-order 
poles could be obtained by integrating the corresponding proba- 
bility density (IA.3b . 



2.4. A special case - power-law power spectra 

In general, the probability distribution function we found is a 
sum formula that needs to be evaluated numerically. However, 
if the power spectrum of the underlying random field is a power 
law P{k) cc \k\^^', we can analytically find a more explicit expres- 
sion for the univariate distribution function. In the case of 



P(k„) = A\k„ 



-2 



(43) 



with A a normalisation constant, and for a separation of jc = 0, 
we have C„ - LA/ and the product factors are 



" n -Tz " n 7 — ^Y? — A ^ ' 

,„±n 1 ,.,2 „,i„ (1 - + 



(44) 



which is a special case of the infinite product family 
(lPrudnikovetal.ll 19861 p. 754) 



oo / J > k-1 J 

m=l ^ ' m=0 



(45) 



The probability density function (at zero separation) is then 



Ma = ^^i/(^)2(-ir'«v2- 



iTrrp-^iiLA) 



(46) 



where the field size L comes in through Eq. (|4]i. We can now 
express the cumulative distribution function in terms of known 
functions as 



oo 

/ oo 
d^X^) = 2i/(f)2(-l)«+'e-2-^- 
^ n=l 

= //(^)[l-,?4(0,e-2-f/("')] . 



(47) 



Here, §4 (0, q) is a special case of the Jacobi elliptic theta func- 
tion, which is known to satisfy tWhittaker & Watsoa,1963> p463 
ff.) 



(48) 



We can then re-obtain the probabiUty density function by differ- 
entiation, 



p(^ = -^(^)^^4(0,e-2-^/<")) 



(49) 



2^ 
LA 



>?;(o, 



where (0, q) 



- il?^^. It is not clear to us yet whether this 
connection to elliptical functions is a coincidence for just this 
special case, or whether it points towards a possible reformula- 
tion of the probability distribution for general power spectra. 

However, it allows us to analyse the asymptotic behaviour of 
the distribution function for large ^. From Eq. ( l48T l, we have 



&\(0,q) = 2^(-l)"nV'"' = -2 -H 8^^ - 18^^ + . 



(50) 



~ -2 + 8q^ 

for q I. Then, inserting into Eq. (ISOl l, we obtain 



p(^) oc e 



'?;(o,( 



-2iri^HLA) 



(51) 



for ^ » 1 . Thus, the distribution function behaves like a single 
exponential at large ^, and not like a Gaussian at all. This is in 
agreement with our general result p{^) cc e^f/'^*-"'"* for large ^ 
from Sect. O 

For the same power spectrum, we can also explicitly calcu- 
late the moments. With C„ oc P(\k\) oc \k\^^, we obtain the m-th 
(central) moment from the sum 



00 . 

y — 



^(2m) 



(-2> 



2m- 1 —2m 



(2m)! 



-B2n 



(52) 
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Here, i^( m) is the Riemann zeta f unction and B„i are the Bernoulli 
numbers (IPrudnikov et al MM p. 776) given by 



DO „ 



(53) 



For example, the mean of the distribution then is simply given 
by ^ - LA 1 12 and the standard deviation is cr^ - LAj {6 VlO) . 

For power law power spectra with a different exponent, 
Eq. (l45T l still allows us to express the product factors explic- 
itly in terms of known (trigonometric and hyperbolic) functions. 
Regrettably, these do not yield any known functions for the full 
probability distribution, or for the moments, as far as we are 
aware. Thus, a really expUcit form was found for the special case 
QfP(k) oc \k\-^ only. 

3. Bivariate distribution 

3.1. Derivation 

In this section, we will calculate the bivariate probability distri- 
bution function p(^{xi),^(x2)), which we will mostly abbreviate 
as p{^i,^2) with ^1 - ^(xi), ^2 - ^(xi). All preliminaries carry 
over from the univariate case, and the starting point of this cal- 
culation is 



^(xi) = 2^ |^„|^cos(^„x,). 



(54) 



The characteristic function is now bivariate as well. 



lA(si,S2) = 



Y] d^gnP(gn) 
n=l ^ 

OC ^ 

n — 

[J 1 -2i(iiC„i + S2C„: 



i(-!lft+S2ft) 



:) 



(55) 



the other choice. Also, we will assume simple poles in both inte- 
grations, and will only briefly comment on the effects of multiple 
poles at the end of this section. 

The poles for the inner integration are now located at 



1 



S2n 



C„l 

2iC„2 C„2 
and, for simple poles, their residues are 

1 



Si 



(58) 



ReSjj^ - 



lim 

_! £ni 

2iC„-, Q,- 



S2 - 



C„i \ e-'(*ifi+-'^ft' 

-: 1 Si 

2iC„2 Cn2 I 27r 



Or 



1 



2i(iiCmi + S2C,„2) 

-ft/(2C„2)-i.!l[fl-(C„l/C„2)ft] 



27r 2C„2 



n 



C„2 



Vm#« ^ 2iD„,„ 



(59) 



Here we have simplified the expression by defining the determi- 
nant factor 



D„ 



nl^ml Cffi 1 Cu2 



det 



Cn2 Ciji2 



ml 



(60) 



The arguments as to the choice of contours apply exactly as be- 
fore, since the imaginary part of the poles remains unchanged 
from Eq. (fTST i to Eq. ( ISSl l. Thus, poles with positive C„2 factors 
lie within the lower contour, whereas those with negative C„2 lie 
within the upper contour The corresponding Heaviside factors 
encoding this behaviour are H{^2)H(C„2) and H(-^2)H{—C„2). 
The winding numbers are w„ - I for the upper and w„ - -I for 
the lower contour So we obtain the full integral as 



In the last step, we have defined a generalised shorthand for the 
factors C„„, = cr^ cos(A:„jic„,) to allow for the two different lag 
parameters x,„. It is worth noting at this point that for higher 
multivariate distributions, say p{^\,^2, ■ ■ ■,^k), the only change 
necessary in the characteristic function will be to add additional 
terms of s„,Cnm in this factor, resulting in the generally vahd 
expression 



(A(si,S2,---,S/t) = ]~~[ l-2i^5;„C, 



(56) 



Next, we will obtain the bivariate probability distribution itself 
from the characteristic function by Fourier inversion, in analogy 
to the univariate case. But an important diflFerence arises in this 
step, since the inversion now contains a double integration. Thus, 
we have to calculate the following: 



oo oo 

/It/ 

— oo —CO 



AS2 



^-i(ilfl+.52ft) 



27r 



(57) 



n [1 -2i(iiC„i +i2C„2)] 

n=l 



Since the pairs of variables { s\ , S2) and (^1 , ^2) each are mutually 
independent, the result has to be invariant under exchanging the 
order of integration. We choose to first integrate over di2 and 
after that over d^i . From the resulting formula, the symmetry 
will not be immediately apparent. However, we have checked 
the equivalence of both approaches by also explicitly evaluating 



00 00 

00 

= J" ^27ri ^ w„ ReSi2„ 



= 2 [H{^2)H(Cn2) - H(-^2)H(-C„2)] e 



1 



■ 2i(iiC„i -I- i2C„2) 



f2/(2C„2) 



n=l 



27r 



2C„2 



n 



C„2 



ym^n '™ '^1 +1 2D„,„ 



2iD„ 



(61) 



The remaining task is to calculate the second integral. For ease 
of notation, we wiU substitute si — > s and introduce the new 
variables 



(-,,2 



and = '^'f . (62) 



Then, the second integral can be reduced to the calculation of 
the term 



r A ( ^ \ 



(63) 
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This integral has poles at = -ij6„m- For simple poles, the 
residues are 

Res,„,„ = lim (s + i/3„„,) ^s''""" ^ 



2^ n(^ + iyS„p) 



n ( /^n p Pnm ) 



(64) 



Furthermore, the choice of contours is also very similar to the 
previous procedure. We will again close the contour with semi- 
circles in either the upper or the lower half-plane, and the purely 
imaginary poles i,„„ = -i/3„m lead, by the same convergence 
argument, to Heaviside factors H{a„)H{/3„m) for the contour in 
the lower half-plane and H{-a„)H{-f3„m) for the contour in the 
upper half -plane. With the usual winding numbers w,, = ±1, the 
integral is 



S„ — 2n\ ^ w„ ^e.^s,„„ 

n 

= [H(a„)H(J3„,„) - Hi-a„)Hi-l3„,n)] 



p+m 

(65) 

Reinserting this result into the full expression (l6ll . the bivariate 
probability distribution function is 



(66) 



/ (C„2-C„,,)fi+(C,„i-C„i)fe \ 



pi^m 



where we used a shorthand notation for all of the Heaviside fac- 
tors: 

= [//(^2)//(C„2) - H{-^2)m-Cn2)] 

X [Hia„)Hil3„^) - Hi-a„)H(-/3n,n)] . (67) 



Finally, we can bring this expression to a more symmetric form 
by reinserting the /3„„, and shifting around some factors: 



.2-C,„2)fl+(C,„l-C„|)f2 



(68) 



exp(i^ 



2D„, 



n [^np 



Dn 



p+n 
p+m 



This derivation is only valid if, in both integrations, none of the 
poles vanish or are of higher order But like we do for the uni- 
variate distribution in appendix |A] we can find corrections to 
get the most general result. In this case, they become rather un- 
wieldy, and we will not present them in this article, since they 
can be easily avoided by choosing well-behaved power spectra 
and non-commensurable lag parameters. However, in Fig. |2]we 
show results for lag parameters which produce such zero modes. 
The corrections were implemented in the numerical code, as de- 
scribed in Sect.m and produce smooth, non-singular results. 

Both panels show results for an one-dimensional field with 
a rather narrow Gaussian power spectrum. Lap - 20, since this 
allows for convergence with a small number of modes. = 16 
was used for the diagrams. The left panel shows the distribu- 
tion for separations x\IL - 0.15 and X2 = 0. It is obvious that 
the distribution obeys the constraint \^{x\)\ < ^(0), with zero 
probability outside the triangular region defined by this con- 
straint. It is also clearly asymmetric in ^i, demonstrating the 
non-Gaussianity. Marginalisations over either ^\ or shown in 
Fig.|3] also demonstrate the non-Gaussian nature of the bivariate 
distribution, and are consistent with our univariate results. 

The right panel of Fig. |2] shows the case of x\IL - 0.2, 
X2IL - 0.25. Here, no constraints are visible, since we have not 
fixed any specific value of ^0, so that the distribution is averaged 
over all realisations with arbitrary ^0 and so both ^\ and ^2 are 
unbound. Still, the non-Gaussian nature is apparent in this case 
as well, since the isoprobability contours have kinks and straight 
segments following the prescription of the Heaviside terms. 

3.2. Moments 

There are analogues of mean, variance and also of higher mo- 
ments for multivariate distributions. As we did in Sect. 12.21 for 
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integrals in 



4 -0.3 -0.2 -0.1 0.1 0.2 0.3 0.4 



Fig. 3. Univariate distributions obtained by marginalisation from the 
bivariate pi^i^^i), for a Gaussian power spectrum with Lap = 20, 16 
modes and jc/L = (0.15,0). solid: p(<fi), dashed: pi^i). 



the univariate distribution, we can calculate them either by inte- 
grating the distribution function or by differentiating the charac- 
teristic function. Since two-dimensional integrals are more cum- 
bersome, we restrict ourselves to the characteristic function ap- 
proach for the bivariate moments. 

The mean is now a vector, but calculation and result are quite 
similar to the univariate case in Eq. (IZST i: 



2 2 Cn\ 

oo 

V n=I 



(69) 



For the central moments, the centralised characteristic function 
is again easy to obtain: 



>/^cisu S2) := (e 



-i(.Sl^^l+.!2ft) 



(70) 



The analogue of the variance in the bivariate case is the 2x2 co- 
variance matrix, which we get from the second centralised mo- 
ments: 



/Ell "^ii] _ I -dlipcisusi) -d\d24ic{si,S2) 



-d2diil/c{s\,S2) -dhpc{si,S2) 



^ 1-0=52 



4 2 4 2 C„iC„2 

n= 1 «= 1 

00 00 

4 2 C„iC„2 4 2 Cl^ 

V n=l n=l 



(71) 



Higher central moments M^^k could be obtained by building ten- 
sors of rank k from the components 



■,-k 



dhdh---dh'lJc{si,S2)\ 



.51=0=52 



(72) 



3.3. Higher multivariate distributions 



In this work, we have derived both univariate and bivariate distri- 
bution functions, but no multivariate distributions of more than 
two correlation functions. In principle, however, these could be 
obtained by exactly the same type of derivation. From the gen- 
eral multivariate characteristic function, Eq. (ISFt , the A:-variate 
distribution function can in general be obtained by solving the k 



J 27r J 27r J 27r 

—00 —CO —CO 

k \ °^ ( ^ 



X exp 



^ m=l 



(73) 

Since the integration was already quite complicated for k - 2, 
this is not very practical for higher multivariates. 

However, since the multivariate characteristic function is not 
very complicated and can be computed directly from the input 
power spectrum, an alternative approach would be to calculate 
this numerically on a grid in s space, and then do a numerical 
complex-to-real Fourier transform on these values to obtain p(^) 
on a corresponding grid in ^ space. 

This approach could be quite efficient for practical purposes, 
where the likelihood function is only needed at discrete points 
anyway. However, the efficiency and stability of this approach 
depends strongly on the input C„,„ factors, since the denominator 
in the integrand might prove to be numerically hard to handle. 
Estimating the performance of such a calculation would require 
further studies. 

4. Numerical implementation 

The numerical implementation of the probability distribution 
function, Eq. ( |22] ). or of the cumulative distribution function, Eq. 
(l42l i. as required for a Bayesian parameter estimation or even for 
simply plotting the functions, is not trivial. For a small number 
of modes and benign parameters (number of dimensions, power 
spectrum, lag parameter), this can be done straightforwardly in 
any computer numerics system. In general, however, the sum- 
mation of many terms with very different orders of magnitude 
(due to the exponentials and the product factors) leads to a high 
demand in numerical accuracy. If the internal accuracy of the 
software is less than the number of significant digits in the sum- 
mands, rounding and addition errors lead to uncontrollable er- 
rors in p{^) and all related quantities. 

We solve d this problem by using the arprec package 
dBailev et al. |r2002). available for C, C-i--t- and Fortran, which 
allows calculations with up to 1000 decimal digits. The higher 
the ratio of separation and field size or the number of signifi- 
cant modes, the higher the necessary precision. However, a large 
working precision significantly increases the run-time for each 
evaluation of the likelihood function. Therefore, for a fast and 
stable implementation suitable for parameter estimation, a way 
has to be found to determine the necessary precision beforehand. 

For plotting the distribution as a function of ^, we only need 
to compute the product factors once. But in a parameter estima- 
tion, the likelihood is evaluated for a different power spectrum 
in each step, and so an efficient computation of the product fac- 
tors is also necessary. We will describe one such possibility in 
Sect. 15.21 which, however, comes with its own numerical issues. 
Further investigation into efficient and stable implementations 
seems necessary. 

5. Discussion 

5. 1 . Analytic properties 

The probability distribution function, Eq. (l22l l. has some inter- 
esting analytic properties. For lag parameters x 0, the distri- 
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bution (as the full infinite sum) is non-vanishing both for pos- 
itive and negative ^, and infinitely differentiable at all points. 
However, there are some subtleties when the sum gets truncated. 
We can most easily see this for the special case of x = 0, where 
all C„ are positive, and we obtain 



(74) 



with the fl„ defined as in Eq. ( |34] |. Thus, we have p{^) > for 
^ > and p(^) - for ^ < 0. The first derivative is 



with the Dirac delta function doix) = 



dH(.v) 
dx ■ 



At ^ = 0, we have 



(76) 



For the last step, we had to define ■ ^d(O) - 0, and to use the 
equality 



(2c„y 



k+l 







(77) 



for k = Q and k - I, which can be proven by the same argument 
used for the normalisation in Sect. 12.21 equating two different 
forms of the characteristic function tlr{s) and then taking the ^th 
derivative in s. 

So far, we have seen that /:i(0) - and p'{0) - 0. If we 
consider the full infinite sum for p{^), the same is true for all 
derivatives 



g -f/(2C„) ( 1) ^" +(...), 



(2C„> 



k+\ 



(78) 



where (...) stands for terms proportional to doix) and its deriva- 
tives: We find that p(^) can be differentiated infinitely often at all 
points along the real axis, with all derivatives vanishing at ^ = 0. 
But, if the sum is truncated at some arbitrary mode number A^, 
this is no longer true. In this case, only the first - 1 derivatives 
will vanish at the origin, and higher derivatives will be discon- 
tinuous at this point, making p(^) differentiable only A^- 1 times. 
This also holds for x 9^ 0: a sum truncated at modes is only 
A^ - 1 times continuously differentiable. Still, this phenomenon 
does not harm the convergence of the sum, and the truncated 
function can still be used as a good approximation of the full 
sum if A^ is chosen sufficiently large, as we have demonstrated 
numerically. 

However, it is of mathematical interest to note that even the 
infinite sum version of Eq. ( l74l l has peculiar properties at ^ = 0. 
If we consider the function p(^) on the complex plane, the direc- 
tional derivatives anywhere but on the real line will not vanish, 
while they do along the real line. Thus, the function is not ana- 
lytic in any neighbourhood of ^ = 0, and therefore the origin is 
an essential singularity of this function. This phenomenon can 
also be seen in the special case discussed in Sect. 12.41 where 



2n^ 



i?;(o. 



-27r'-^/(LA)\ 



(79) 



The elliptic theta function shares just this analytic behaviour 
Still, these functions are smooth for the purposes of real cal- 
culus, and there are no problems in practical applications of our 
results due to this phenomenon. 



Also, discussion in the complex plane allows for another in- 
teresting result. Recalling the characteristic function from Eq. 
], which is in general a complex function, we have 



11=1 «-l " 

(80) 

Obviously, the real part is an even function in s, while the imag- 
inary part is odd. Considering the back transform to the proba- 
bility distribution function, we find 



CO 

= / ^[cos(i^)-isin(*^)] [%(i{r(s)) + i5ms))] 

oo 

" / S ['^"'^^^^^ + sin(i^)3 (ifris))] 

— oc 

oo 

+ i J ^ [cos(^^)3 ms)) - sinisOnms))] . (81) 



Therefore, the real part of p(^) contains an even-even and an 
odd-odd term, whereas the imaginary part consists of two even- 
odd terms, which vanish under the integration. This way, we 
have proven that the probability distribution function is indeed 
purely real, a fact that was not immediately evident from the 
original derivation, which had the complex characteristic func- 
tion as an intermediate step. 

Going to the bivariate distribution function, we find analyti- 
cal issues similar to those in the univariate case. If both lag pa- 
rameters are non-zero, p(^i,^2) is non-zero in the full (^1,^2) 
plane and smooth everywhere. If, however, one of the ^, - 0, 
then the probability density is strictly zero outside the bound- 
aries defined by the constraint inequality |^(x)| < |^(0)|, with the 
function going smoothly to zero at these boundaries, in the sense 
that the real directional derivatives (partial derivatives along unit 
vectors) vanish when crossing the boundaries. However, we have 
only checked this smoothness numerically, since an analytical 
calculation of the derivatives would become rather convoluted. 



5.2. Alternative calculation of product factors 



In principle, the distribution function (l22l i is simply a weighted 
sum of exponentials. However, the weights are given by the 
products 



(82) 



These make the summation complicated, since closed-form ana- 
lytic expressions exist in special cases only, and high accuracy is 
needed to compute them numerically. Therefore, we were look- 
ing for alternative ways to calculate these products, and found 
an approach by a linear set of equations (LSE). 

We obtain the first equation of this system from the normal- 
isation, as calculated in Sect. 12.21 



(83) 



J d^/7(^) = J]fl„ 
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Fig. 4. Edgeworth expansions compared to full analytical distributions, for Gaussian power spectra. 

Left panel: x = 0, Lap = 20, A' = 16; solid: analytical, dashed: 0-order Edg., dotted: 3rd order Edg., dash-dotted: 4th order Edg. 
Right panel: x/L = 0.5, Lcrp = 100, A' = 32; solid: analytical, dashed: 0-order Edg., dotted: 3rd order Edg., dash-dotted: 6th order Edg. 



For the remaining equations, we consider again the smoothness 
of the distribution function at the origin. As already discussed in 
Sect. 15. II the first A^- 1 derivatives of p{^) with respect to ^ yield 
the equations 



N 



a„ 



''''"'-lac. 



k+\ 



= 0. 



(84) 



Then, we can consider the factors a„ as the components of a 
vector a - {ai,a2, . . ., a^)- Combining this with another vector 
ft = (1, 0, 0, . . . , 0) with entries and with the x coefficient 
matrix M of the above equations, we obtain the LSE 



1 

(2Ci)- 
(2Ci)- 



1 

(2C2)- 
(2C2)- 



(ICxy (2C2)- 



1 

(2C^) 
{2Cn) 



(2CMr 



\ 






(l\ 




fl2 









as 







J 






.0. 



(85) 



We can then solve this LSE by any of the usual methods. In par- 
ticular, we applied the Singular Value Decomposit ion (SVD) as 
implemented in the GNU Scientific Library (GSL, iGalassi et alj 
[2009). This implementation has limited precision and there- 
fore runs into the same numerical problems described in Sect, 
m However, SVD solutions for many numerically challeng- 
ing problems exist, and therefore the general approach seems 
promising. 

5.3. Edgeworth expansion 

Since the moments have simpler expressions (Eqs. |25] and |28] 
to [32l i than the probability distribution function itself, it seems 
promising to express the distribution in terms of its moments. 
One way to do so is the Edgeworth asymptotic expansion, de- 
scribed in .Blinniko v & Moessner ( 1998). It has the form 



"=1 IK, 



kj. 



f 

11 

n 



(86) 



m=l 



a-2™+2(OT-i-2)! 



with the (probabilist's) Hermite polynomials He„, and ^ and cr 
as given in Sect. 12.21 The inner sum runs over all sets {k„i] of 
non-negative integers solving the Diophantine equation 



^ mk,„ = n , 



and we also defined, for each such set. 



(87) 



(88) 



We calculate t he [km] and r with the alg orithm presented in 
Appendix C of iBlinnikov & Moessner (^998). 

The first term in the Edgeworth expansion is a simple 
Gaussian, and the higher order terms are given by the cumulants 
Kn of the distribution in question, which we can derive from the 
central moments Mc,„ as 



_^/«-l\ 
^ \m - 1 / 

117= 1 ' ' 



(89) 



In Fig.m we demonstrate the performance of the Edgeworth ex- 
pansion in two examples, both for Gaussian power spectra. The 
left panel shows a distribution at zero lag, with Lcrp - 20 and 
16 modes. The Edgeworth term of order zero, a Gaussian with 
the same mean and variance as the full distribution, is a very bad 
fit in this case. The third-order Edgeworth expansion is the best 
fit, fitting the peak of the distribution almost perfectly and the 
tail decently well, while also producing negative probabilities 
for some ^ < 0. Adding additional terms only makes the approx- 
imation worse, distorting it in the high probability region. For 
the right panel, we used x/L = 0.5, Lcrp = 100 and A^ = 32. For 
this more symmetric distribution, the Gaussian is already a bet- 
ter fit; still, the third order Edgeworth expansion fits even better 
at the peak. Higher orders again begin to deviate, as evidenced 
by the strongly two-peaked sixth-order expansion. 

The reason for this behaviour is that, for the higher terms, 
the Edgeworth expansion assigns high weights to the tails of the 
distribution. Since the distribution we are analysing is in fact far 
from Gaussian, especially in the tails, the fit to the peak coiTe- 
spondingly gets worse, and often even multiple peaks appear. 
This is similar to the effect noticed by IBlinnikov & Moessneil 



10 



D. Keitel and P. Schneider: Constrained probability distributions of correlation functions 



30 p 
25 
20 
S 15 - 
10 - 

5 - 





-0.3 -0.2 



-0.1 0.1 0.2 0.3 




Fig. 5. Univariate distributions for a power-law power spectrum and different separation vectors x, demonstrating isotropy in 2D. 
Left panel: dashed: x/L = (0.3, 0), dotted: x'/L a (0.212, 0.212), Right panel: dashed: x/L = (0.03, 0), dotted: x'/L x (0.0212, 0.0212) 



(Il998l) for the example of distributions. Also, the regions 
of negative probability density are generic features of the 
Edgeworth expansion when applied to strongly non-Gaussian 
distributions. 

Still, if we truncate the Edgeworth expansion after a wisely 
chosen number of terms, it will provide a good approximation to 
the true probability distribution. Since Eq. (l8&t is an asymptotic 
series expansion, it generally does not converge, but we do have 
a method at hand to find the optimal number of terms and to con- 
trol the error we make. If we truncate the sum over n in Eq. (l86T l 
after terms, the last term retained will also give the order of the 
difference between the full p{^) and this partial sum expansion. 
Therefore we can, in practice, simply truncate the expansion at 
the term with minimal contribution (measured at the peak of the 
distribution, at a certain point of evaluation, or integrated over a 
domain in f we are interested in). 

For both cases illustrated in Fig.|4] this criterion gives N = 3 
as the optimal order of expansion. Also for other parameters x/L 
and L crp, such a third-order Edgeworth expansion seems to be 
a safe choice to get a substantial improvement as opposed to a 
simple Gaussian likelihood. 

5.4. Higher dimensions 

In all of the above calculations, we assumed one-dimensional 
random fields. However, we can easily generalise all results to 
higher dimensions. If we go to A^dim dimensions, with lag pa- 
rameters X = (xi, . . . , jcai,.,^), then the allowed Fourier modes are 
aU 

y«= y(«i,...,«iv,iJ (90) 

with integer m. Still, all the modes are independent and each has 
a Gaussian probability distribution with its dispersion given by 
cr„ = P(\k„\)/L^^^^. The derivation of p(^) stays exactly the same 
as presented in Sect. 12.11 Where necessary, we can renumber all 
modes n with a single integer n by an arbitrary scheme and retain 
the old notations with scalar indices. Then, Eq. ( l22l i still holds, 
with only two important changes. 

First, the sums now go over many more modes, namely all 
possible vectors of integers n = {ni, . . . ,nN^.^J with n, e N. 
If, in a numerical implementation, we want to cover a box in 
A:-space with grid points in each direction, we therefore end 
up with N^'>'>«> modes. Besides the increased computational cost, 
this also leads to frequent occurrences of multiple poles. This is 



the case especially for x - 0, since then the C„ depend on \n\ 
only. Therefore, the generalised result of appendix |A] gets nat- 
urally important in higher dimensions. However, in real appli- 
cation scenarios, where accuracy is limited by external factors 
anyway, it is always possible to avoid multiple poles by slightly 
changing the C„ factors, e.g. by adding a small number e in the 
cosine, 

C„^C'„^<Tlcos(xk„ + e). (91) 

Doing this removes the multiple poles while only sUghtly chang- 
ing the results with respect to using the unmodified C„ and the 
full multi-pole formula, Eq. ( IA.3b . 

As a second effect, the factors C„ now depend on the angle 
between separation vectors: and mode vector k„: 

C„^crlcos(xk„). (92) 

However, a Gaussian random field is completely determined by 
its power spectrum, and when we assume a P(k„) which depends 
on the absolute values of the k„ only, such a field should be sta- 
tistically isotropic. Any anisotropics seen in our results must be 
a consequence of using a finite, cubic field instead of an infinite 
field. So we expect that all anisotropics vanish as soon as most 
of the power comes from scales much smaller than the field size. 
For power spectra concentrated narrowly towards ^ = 0, i.e. 
those with only a few significant modes, the low-k, large-scale 
modes will dominate for any finite field size. But for a wide 
power spectrum, where many small-scale modes contribute, a 
sufficiently large field size should lead to approximate isotropy. 

This is demonstrated for a two-dimensional field with a 
power spectrum P(k) = k^^ in Fig. |5] Both panels show the 
probability distributions for two separation vectors x and x' of 
the same length, but rotated by 45°. (Or, more precisely, with 
X - (;ci,0) and jc' - (xi cos(45° -H e), jci cos(45° - e)), with 
e = 10""^, to avoid double poles.) For a large separation to 
field size ratio, \x\/L - 0.3, seen in the left panel, the distri- 
butions for different separation vectors are quite different, while 
for \x\/L = 0.03, in the right panel, they have almost converged. 
Note that, for P{k) = k^^ in 2 dimensions, the field size L cancels 
out of the cr„, and therefore we could keep the power spectrum 
normalisation constant while changing \x\/L, without changing 
the scale of p{^). 

Isotropy is even easier to obtain for power spectra that are 
small both for very small and very large k and large only at in- 
termediate wavelengths, since then the low-k, large scale modes 
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do not harm isotropy. The ACDM power spectrum in cosmol- 
ogy fulfils this condition, allowing isotropic N-body simulations 
with reasonable field sizes. 

However, these are no fundamental changes, and the result- 
ing distribution function has all the same properties as the one- 
dimensional version. So all our results can be readily applied to 
higher dimensions, as long as the computational difficulties can 
be handled. 



fluctuations at that epoch were still either Gaussian or close to 
Gaussian. Furthermore, this work could also be relevant to fields 
outside of cosmology, for example in the common problem of 
time series analysis of Gaussian random processes. 
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6. Conclusions 

We have considered the problem of accurate likelihood func- 
tions for Bayesian analyses of data from Gaussian random fields. 
Making use of Fourier mode expansions and characteristic func- 
tions, we have derived analytically the probability distribution 
function of the correlation function for a one-dimensional finite 
Gaussian random field. For general power spectra, we can only 
give a sum formula for the distribution function. However, for 
the special power spectrum P(k) cc k^^, we have found an ex- 
plicit expression in terms of elliptic theta functions. We can also, 
for general power spectra, calculate arbitrary moments and cu- 
mulants of the distribution by much simpler sum expressions. 

Then, we continued the analytical approach forbivariate dis- 
tributions as well, and found a similar, but even more compli- 
cated sum formula as a result. We have also outlined a gen- 
eral procedure for calculating arbitrary high multivariate distri- 
butions, though this is not feasible in practice. 

Furthermore, we have considered the analytical properties of 
the new probability distribution function, which are well under- 
stood and consistent with numerical results. We used the mo- 
ments of the univariate distribution to construct an Edgeworth 
expansion, which is able to closely approximate the true distri- 
bution function in the region of highest likelihood, and therefore 
could be a useful replacement of simple Gaussian approxima- 
tions in Bayesian analyses. Finally, we found that all our results 
easily generalise to multi-dimensional fields. 

Considering possible future applications of these results, we 
have to point out the importance of improving the correlation 
function likelihood, as well as the further steps that are necessary 
for a practical implementation. 

As already mentioned in the introduction, the Gaussian ap- 
proximation for the likelihood can lead to considerable devia- 
tions in parameter estimation. For example, in cosmic she ar stud- 
ies, th is leads to significantly reduced accuracy (Hartl ap et alJ 
Similar effects are to be expected in other fields where 
correlation functions are used. 

However, the work presented in this article has so far been 
purely mathematical, and the results are not readily applicable 
to real data. The main obstacle lies in the infeasibility of ana- 
lytical calculations for higher multivariate distributions. If data 
of the correlation function over bins needs to be analysed, we 
would need the full A^-variate distributio n function. Therefore, 
we expect that a numerical approach, as bv lWilkinf? 1^ Snhneided 
din preparation!) , is best suited for practical computations. Still, 
their 'quasi-Gaussian' approach makes direct use of the analyt- 
ical univariate distribution function presented in this article. We 
also expect that the analytical results will yield important guid- 
ance and cross-checks for future numerical implementations. 

We also note that our analytical results depend on the as- 
sumption of a Gaussian random field, whereas a lot of cosmolog- 
ical data probes the evolved density field on small scales, which 
is far from Gaussian. Nevertheless, applications for the analyt- 
ical likelihood function could be found on very large scales, 
or in cosmic microwave background analysis, since the density 



Appendix A: IVIultiple poles 

So far, to keep calculations simple, we have considered simple 
poles only. However, it is entirely possible to have multiple poles 
in the characteristic function, i.e. to have some mode numbers 
m + n with Cm - C„. This could happen if 

- the power spectrum is non-monotonic, with P{k„) - P{km) 
for some m + n, 

- in higher dimensions, several different modes n have identi- 
cal absolute value \k„\, 

- one of the lag parameters is commensurable with | and thus 
produces periodicity in cos(jc • k„). 

From now on, we will use a scalar index n running over all 
modes n by some arbitrary numbering scheme. If a multiple pole 
of order k is present for some set of modes, which we will call 
N* - {m e N|Cm = C„), we can calculate the residue as 



at-i 



ReSt - 



lim 



( {s - .„)^e-"^ 
(1 -2isC„)* 



2\sC„ 



(A.l) 

where the weight A^„ is the number of modes in N*. We can then 
rewrite p{^) as the usual sum, limited to single pole modes, plus 
correction terms for the multiple poles. For example, when there 
are some double poles N* and all other modes have single poles, 
the full expression reads 

U^mc^ 1 



c„ 



new « m#« ^ c„ V C„,^C 



Cm 



1 _ 



(A.2) 

For general types of poles, we can absorb all multi-pole contri- 
butions in a pole order correction factor !P„, so that we can write 
the probability distribution function compactly as 



p(0 = 2 n,e-f/'2^"' n Y] ■ (A.3) 

»=1 mi^ii di 

If the n-th mode belongs to a set of poles of order k, its correction 
factor is 



r(k) 



1 



k-l 



Ifl,) '=1 



^!(2C„)* 

where the outer sum goes over all sets of integers a, that fulfil 
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f C ] 


i 




c„,^c„ 


1 _ ^ 




J 



(A.4) 



A-l 



flo 



(A.5) 



1=1 



This expression was extrapolated from explicit calculation of up 
to quintuple poles. The prefactors A^a.^ obtained in these calcu- 
lations are given in table lA.ll We still have to find a general 
expression for these prefactors, so that we can also give fik) for 
k>5. 
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Table A.l. Pole order correction factors 



k 


(a,) 




1 


0, 0, 0, 0, 


1 


2 


1, 0, 0, 0, 


1 




0, 1, 0, 0, 


-2 


3 


2, 0, 0, 0, 


- 1 




1, 1,0, 0,0 


4 




0, 2, 0, 0, 


-4 




0, 0, 1, 0, 


-4 


4 


3, 0, 0, 0, 


- 1 




2, 1, 0, 0, 


6 




1, 2, 0, 0, 


- 12 




1,0, 1,0,0 


- 12 




0, 3, 0, 0, 


8 




0, 1, 1,0,0 


24 




0, 0, 0, 1, 


16 


5 


4, 0, 0, 0, 


1 




3, 1, 0, 0, 


-8 




2, 2, 0, 0, 


24 




2, 0, 1, 0, 


24 




1, 3, 0, 0, 


-32 




1, 1, 1,0,0 


-64 




0, 4, 0, 0, 


16 




0, 2, 1, 0, 


96 




0, 1,0, 1,0 


128 




0, 0, 2, 0, 


48 




0, 0, 0, 0, 1 


96 



Note added in proof 

Recently, it was brought to our attention (thanks to Stefan 
Hilbert) that a related distribution has long been known in the 
fields of renewal theory and signal processing. In the special 
case that all C„ > 0, the correlation function (Eq. fTTT i simpli- 
fies to a sum of squares of absolute values of complex Gaussian 
random variables, equivalent to a sum of exponential variables, 
and our univariate distribution (Eq. l22t is equivalent to a type 
of generalized Erlangian distribution (see Cox 196^. A more 
generalized, multivariate version is given as 'Theorem 4' in 
[Hammarwall et al. (2008), but this is still limited to positive pa- 
rameters. Therefore, these results do not apply for the case of 
arbitrary signs of the C„ which is needed for our purpose. 
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